ABSTRACT. It is proposed that the 3-D representation of an object is based primarily on a
stick-figure configuration, where each stick represents one or more axes in the object's
generalized cylinder representation. The loosely hierarchical description of a stick figure is
interpreted by a special-purpose processor, able to maintain two vectors and the gravitational
vertical relative to a Cartesian space-frame. It delivers information about the appearance of
these vectors, which helps the system to rotate its model into the correct 3-D orientation
relative to the viewer during recognition.


This report describes research done at the Artificial Intelligence Laboratory of the
Massachusetts Institute of Technology. Support for the Laboratory's artificial intelligence
research is provided in part by the Advanced Research Projects Agency of the Department of
Summary

1. It is observed that the generalized cylinder representation of a 3-D object generates two
kinds of problems: describing the cross-sections associated with each axis, and representing
the relative dispositions of the axes in space.

2. The second problem amounts to representing the spatial arrangement of a stick figure. A
method for doing this is given.

3. The stick-figure is described by a loosely hierarchical assertional database, called a 3-D
model. Use of this database is flexible, and it can support levels of description that cover the
spectrum from a very coarse overall summary to very fine detail of one small part.

4. In order to be used, a 3-D model has to be interpreted through an (essentially) analogue
mechanism, called the image-space processor. In its minimal implementation, this processor
maintains a representation of two directions (called $axis and $spasar) in a Cartesian space-
frame, in addition to the gravitational vertical.

5. The image-space processor's instruction set is small. Its important functions are:

(a) setting the $axis to one of the space-frame's 3 axes or to the gravitational vertical;
(b) setting the $spasar to an arbitrary orientation relative to the $axis; this includes
the ability to rotate the $spasar about the $axis;
(c) setting the $axis to the orientation of the $spasar; and
(d) rotating the space-frame about four distinguished axes, its three coordinate-axes
and the gravitational vertical. (In a minimal implementation of the image-space
processor, the position of the $spasar would have to be reconstructed after a frame
rotation, rather than being rotated with it.)

6. The image-space processor can deliver information about the lengths and orientations of
the projections of the $axis and $spasar onto the image plane. These help the system to
"rotate" its model into the correct 3-D orientation relative to the viewer. Some evidence is
given that this can be carried out by a process of relaxation.

7. Fahlman's symbol-mapping problem is dealt with by dividing it into its component problems,
and using special techniques for each component. The problem of indexing for recognition is
discussed.

8. It is observed that this theory may help to explain various aspects of the psychology of
human vision. These include the "mental rotation" experiments of Shepard and his
collaborators, and the clinical disabilities described by Warrington & Taylor (1973) that follow
right parietal lesions.


The two current ideas for representing three-dimensional structures are the
generalized cylinder representation proposed by T. O. Binford and implemented by Agin
(1973), Nevatia (1974), and by Hollerbach (1975); and the "multiple view" representation
(Minsky 1975). The generalized cylinder representation of a structure is obtained by
specifying its axis and the cross-section at each point along it. Agin and Nevatia used a laser
range-finding technique to obtain the generalized cylinder representation of such objects as a
Barbie doll, a snake, and a horse. Hollerbach studied the representation of a wide range of
pottery. The multiple view representation is based on the insight that if one chooses one's
primitives correctly (e.g. the "side" of a cube), the number of qualitatively different views of
an object may be quite small. A number of important questions of detail remain unanswered
because this idea has not yet been implemented, and it remains to be seen whether a theory
can be built upon it.

The generalized cylinder representation introduces two main problems: obtaining
the axis and a description of the cross-section of the different parts of an object (arms, legs,
torso), and representing the spatial disposition of the components thus obtained. The second
of these problems has hitherto received no attention, and it is the one that we address here.
To solve it, one has to tackle directly the problem of representing the positions of items in
three-space, and this article presents a method for doing it which we believe may be of
interest to experimental psychologists.

Statement of the problem

The principle of modular design is central to the vision system of which this
article describes a part. For example, the processes that define a place-token in an image are
almost independent of the processes that subsequently group them (Marr 1975); the
processes that select items to be tested for symmetry are similarly independent of the
routines that detect it (Marr 1976); the extraction of a form from the primal sketch is often
independent of the processes that describe that form (Marr 1975); and much of the
segmentation of a form into its generalized cylinder description can apparently proceed (to a
first approximation) independently of knowledge about what that form is (Marr & Vatan, to
appear). The representation of the three-dimensional structure of an object using generalized
cylinders can also be split into the two problems mentioned above, and our first proposition is
that the two problems are dealt with by separate modules. One module computes the
description of the shape of each component, and another describes the relative spatial
dispositions of those components.

From this proposition, it follows that describing the spatial disposition of parts
of an object or animal may be reduced to the problem of describing the dispositions of the
axes that occur in its generalized cylinder description. Thus for animals, our problem reduces
to that of describing stick figures - models made out of pipe-cleaners, one for each axis (see
figure 1). The vision system being constructed at our laboratory is already capable of
computing this description from a raw image in simple cases.

The problem then is to represent the three-dimensional configuration of a


FIGURE 1. The theory asserts that the 3-D representation of a shape is decomposed into two
parts: the description of the cross-sections that occur in the shape's generalized cylinder
representation, and the disposition of the axes of these cylinders in space. The theory deals
with the second problem, which is essentially the problem of describing stick figures. The
shapes in these figures were made out of pipe-cleaners. The reader will have no trouble in
recognizing the giraffe, deer, rabbit and ostrich. That their recognition is so easy makes it
reasonable to suppose that at some stage, we ourselves decompose the 3-D representation
problem into similar components.
FIGURE 2. This is the raw data provided to our system from the intermediate visual processor.
It consists of a collection of imagels which are descriptions of individual generalized cylinders
found in the image. Each imagel has two end points in the image plane and optionally a shape
property such as $stick which supplies additional information about the imagel, such as
average thickness, roundedness, flatness, and so on.
FIGURE 3. This diagram summarizes our overall view of the recognition problem. The
important points for the present article are (a) that the representation of 3-D models is quite
separate from the representation of functional semantics; (b) indexes exist that give rich
access to functional semantics pointers from descriptions at every stage after the separation
of figure from ground; and (c) that for difficult images, considerable interaction may have to
take place between the description of a form, the image-space processor, and the 3-D model
indexer before an appropriate 3-D model is found. When one is found, still another step (of
relaxation) may be necessary before the functional semantics indexer acquires enough
information to recover the correct pointer.

















jointed stick figure so as to relate the angles and apparent lengths found in an image like
figure 2 to the three-dimensional structure of the object and the perspective from which it is
being viewed. We want a solution that in some sense minimizes the computational complexity,
but not necessarily the computational power, of the machinery required to implement it.

Background: recognition is not an all-or-none process
Before giving an outline of the theory, we need to make two general points
that have deeply influenced the way we approached the recognition problem. The first is that
the stored three-dimensional representation of an object is separate from the representation
of its functional semantics. This article deals only with the three-dimensional problem; that a
horse trots, gallops, eats grass, can be ridden, and is liable to kick are not represented here.
It is very reasonable to keep the two separate, because a living horse differs in a
fundamental way from a statue of a horse, despite its similar geometry. Nevertheless, there
are grounds for thinking that the top-level token that organizes the functional semantics of a
horse is the one that is closest to the linguistic label "horse", and part of what we mean by
recognition is the ability to address this pointer on viewing an image.

Which brings us to our second point. It is often the case that the functional
semantics pointer can be acquired quite early in the analysis of an image - many simple and
definite cues exist that can be extracted before a 3-D description has been built. Any
indexing strategy for fast recovery of the functional semantics pointer would certainly take
advantage of this, which means being sensitive to descriptions at every stage after
figure-ground separation. The variety of cues that are available in most images probably means that
only rarely will one have to proceed all the way to a 3-D description before a match in the
database is found. Indeed, we would expect this to happen only when the object is being
seen from a deceiving perspective, or when the prevailing illumination is unusual. It is quite
easy to design flexible indexing techniques that can make use of clues of diverse kinds from
different levels.

Our overall picture of the recognition problem is illustrated in figure 3. This
makes clear our belief that there are many paths to the functional semantics pointer, some of
them fast but not necessarily available from every image, and others that are slower, but
which usually guarantee results. In a penetrating analysis, Warrington & Taylor (1973)
concluded that the 3-dimensional description and the functional semantics of an item are
represented in distinct cortical areas. Their evidence for this assertion is double dissociation
between the two kinds of deficit, observed in patients with left (for semantics disorders) and
right (for disorders of three-dimensional representation) parietal lesions.

This article is concerned with only one small part of figure 3, namely the
construction of a 3-D model of the disposition of the axes in space. In order to be convinced
that this is certainly one of the possible paths to the functional semantics pointer, one has
only to look at figure 1. Pipe-cleaner animals exhibit only the lengths and dispositions of their
axes, yet we have no trouble recognizing the giraffe, rabbit or ostrich in this figure.


Outline of the theory

There are four main components to the theory. We give a brief description of
them first, so that the reader has an overall framework within which to fit the details.

At some stage in the representation of three-dimensional space, one needs a
primitive ability to represent a vector (i.e. a direction and a length). Accordingly, the first
component of the theory is a processor that provides this primitive ability. It is called the
image-space processor, and it can maintain two connected vectors within a supporting space-
frame. These two vectors are called the $axis and the $spasar (an abbreviation for space-
arrow). The processor can translate the end of the $spasar to an assigned position on the
$axis, can rotate it around the $axis, and can rotate it in the plane containing both vectors. In
this way, the $spasar can be brought to an arbitrary relation with the $axis.

In addition to these facilities, the image-space processor can move the $axis to
wherever the current $spasar happens to be, and it can act as though a small number of
space-frame rotations could be performed. We do not regard the space-frame rotations as
true extra facilities, because there are ways of simulating them using only the $axis and
$spasar.

The usefulness of the processor for recognition arises from the fact that as
well as maintaining the three-dimensional relation between the $axis and the $spasar, it can
compute the lengths, directions, and angle between their projections. The computational load
attached to doing this is small.

The second component of the theory is a propositional database that
represents by assertions useful three-dimensional relations between the axes of the objects
being viewed. The datastructure for a single physical object is called a 3-D model, and its
purpose is to explain every image element delivered by earlier visual processes, up to a level
of detail appropriate to the circumstances. There are two important points about this
database. Firstly, its organization is loosely hierarchical. It can provide descriptions of parts
of an object that cover the spectrum from a coarse, one-axis description of a whole
object, to a fine specification of one small part of it. For example, at the top level, a horse
may be representable as a single, horizontal axis. At a lower level, the two forelegs are
treated as a single axis. At the next stage, this description decomposes to the left-foreleg
and the right-foreleg; and further down, the single axis description of the left-foreleg
decomposes to two, splitting at what the layman would call the knee. Figure 4 shows some of
the ways in which a typical animal datastructure could be decomposed. The second point is
that three-dimensional positions are represented by local relations between adjacent parts of
a body, not by absolute coordinates in a circumscribing frame of reference. Thus the position
of a toe is stored relative to a foot, which is stored relative to a leg, which in turn is stored
relative to the torso. In order to discover the relationship between the head and the toe,
the intermediate relations have to be examined.
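
The way local relations have to be chained can be illustrated with a small sketch. It is only an illustration under stated assumptions: the part names, the `PARENT` table and the function names are hypothetical, and each part records nothing but the part it is stored relative to.

```python
# Hypothetical part names; each part stores only the part it is attached to,
# as in "toe relative to foot, foot relative to leg, leg relative to torso".
PARENT = {
    "torso": None,
    "neck": "torso",
    "head": "neck",
    "leg": "torso",
    "foot": "leg",
    "toe": "foot",
}

def chain_to_root(part):
    """Return the list of parts from `part` up to the root, inclusive."""
    chain = []
    while part is not None:
        chain.append(part)
        part = PARENT[part]
    return chain

def relations_between(a, b):
    """List the local links that must be examined to relate part a to part b."""
    up_a, up_b = chain_to_root(a), chain_to_root(b)
    common = next(p for p in up_a if p in up_b)   # nearest shared ancestor
    path_a = up_a[: up_a.index(common) + 1]       # a ... common
    path_b = up_b[: up_b.index(common)]           # b ... child of common
    path = path_a + path_b[::-1]                  # a ... common ... b
    return list(zip(path, path[1:]))

print(relations_between("head", "toe"))
```

Relating the head to the toe therefore touches five stored relations in this toy model, whereas relating the toe to the foot touches one.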

The third component of the system is the interpreter, whose job is to create
and maintain the interface between the database, the image-space processor, and the
information being delivered from the image. The interpreter is capable of reading the
assertions in a 3-D model, binding the $axis and $spasar in the image-space processor to
FIGURE 4. The representation of a stick figure is loosely hierarchical, and allows descriptions
to be created at many levels of detail. At the top level, a horse may be represented as a
single, horizontal axis (to answer questions like "Where is the horse pointing?"). To answer
the question "Where is its front left hoof pointing?", the left foreleg will have been unpacked
to a considerable degree of detail, while the hindlegs may still be bound to nothing finer than
a single HINDLEGS axis. This figure shows some of the ways in which a typical ANIMAL
datastructure may be decomposed.







FIGURE 5. This figure, taken from Minsky & Papert (1972), illustrates the influence of an axis
on the description of a figure. In one row, the shapes are seen as squares, and in the other,
as diamonds. The establishing of axes in a 2-dimensional figure is important for our theory,
since it determines how the description of a 2-D configuration is constructed. This figure is
the 2-D analog of figure 1, since it establishes that one precondition for using our theory as a
psychological model - namely the computation of axes during the analysis of 2-D patterns - is
satisfied by our visual systems.













FIGURE 6. The principal directions defined in the text are displayed graphically in the figure,
which shows the space-frame. The image-space processor is capable of simulating rotations
about $up, $front, $horizontal and $vertical. After such a rotation, the position of the $spasar
(and possibly also of the $axis) must be reconstructed from adjunct relations in the 3-D model.



appropriate axes in the 3-D model, and causing these vectors to be rotated until they have
the same 3-D relation to one another as is specified in the model. The interpreter can
compare the resulting vector with an image element, and can report on any discrepancies
between predicted and measured properties of the image. Various global variables are set
and read by the interpreter; they include a certain degree of translational freedom and an
overall scale factor, called the $scale, which governs the relation between the size property
of an item in the database and the length of the $spasar to which it gives rise. Decisions
about which 3-D model to instantiate and what parts of it to concentrate on form the
controlling information for the interpreter, which then tries to match the model to the image
by three-dimensional rotations.

The interpreter's main requirement is that it be able to move around the
datastructures it instantiates in a fluent and agile manner. This is necessary because the
computational resources in the image-space processor are limited to representing at most two
vectors at once, whereas the range of questions one needs to be able to ask of the whole
system is large. For example, in order to answer the question "In which direction is the horse
pointing?" the image-space processor has to be bound to the horse 3-D model in a completely
different manner from that required to answer "Where is its front left hoof pointing?" at a
particular instant during a step. Several interesting issues were brought into focus by having
to design a satisfactory implementation of the interpreter.

The final component of the theory is something we call the relaxation
hypothesis. This hypothesis states that by using the various cues available from the image,
including information about obscuration, lighting and support as well as the lengths and angles
observed there, it is usually possible to align the image-space representation of a viewed
object accurately with the object's real-world orientation. Furthermore, this may be
accomplished by a relaxation technique; that is, at any instant the compensating rotation is
made about that axis (of the four available in the image-space processor) which reduces the
largest discrepancy currently pictured. We conjecture that this strategy will converge. This
component is still a hypothesis because we have not yet finished implementing it. Our
expectation is, however, that its implementation will strongly resemble that of the earlier
visual processes with which we have had experience - i.e. it will consist of a considerable
number of specialized diagnostics that interact in a fairly simple way.
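
A toy version of such a relaxation can be sketched as follows. It is not the authors' implementation, which the text says is unfinished, but a greedy search under several assumptions of our own: the viewer looks along the third coordinate, the discrepancy is a single sum of squared differences between projected model vectors and image vectors, the step size is halved whenever no rotation helps, and all function and parameter names are hypothetical.

```python
import math

def rotate(v, k, angle):
    """Rotate vector v about unit axis k by `angle` radians (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(k, v))
    cr = (k[1]*v[2] - k[2]*v[1], k[2]*v[0] - k[0]*v[2], k[0]*v[1] - k[1]*v[0])
    return tuple(v[j]*c + cr[j]*s + k[j]*dot*(1 - c) for j in range(3))

def discrepancy(model, observed):
    """Sum of squared differences between projected model vectors and image vectors."""
    return sum((m[0] - o[0])**2 + (m[1] - o[1])**2 for m, o in zip(model, observed))

def relax(model, observed, step=0.3, min_step=1e-3, max_iter=1000):
    """Greedily rotate the model about one of four axes until no rotation helps."""
    # The three frame axes turn with the model; the vertical stays fixed.
    frame = {"up": (0.0, 1.0, 0.0), "front": (0.0, 0.0, 1.0), "horizontal": (1.0, 0.0, 0.0)}
    vertical = (0.0, 1.0, 0.0)
    model = [tuple(v) for v in model]
    best = discrepancy(model, observed)
    for _ in range(max_iter):
        if step < min_step:
            break
        axes = list(frame.values()) + [vertical]
        trials = [(discrepancy([rotate(v, k, a) for v in model], observed), k, a)
                  for k in axes for a in (step, -step)]
        score, k, a = min(trials, key=lambda t: t[0])
        if score < best:                      # accept the most helpful rotation
            model = [rotate(v, k, a) for v in model]
            frame = {name: rotate(ax, k, a) for name, ax in frame.items()}
            best = score
        else:                                 # nothing helps at this step size
            step /= 2
    return model, best

true_pose = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.7)]
observed = [(v[0], v[1]) for v in true_pose]
start = [rotate(v, (0.0, 1.0, 0.0), math.radians(40)) for v in true_pose]
initial = discrepancy(start, observed)
aligned, final = relax(start, observed)
print(initial, final)
```

The sketch only shows the control structure of the hypothesis: at each instant one rotation, about one of four axes, is chosen because it reduces the current discrepancy most. Whether such a scheme converges on real images is exactly what the text leaves as a conjecture.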

It will be evident that the image-space processor is inherently powerful enough
to represent two-dimensional patterns, such as the configuration of features on a face. Such
patterns may be thought of as degenerate cases in which the girdle-angle is zero. The only
requirement is that these patterns be described in an appropriate way in the database. This
means that axes have to be set up in the two-dimensional pattern, and the configurations
have to be described in the usual way relative to those axes. Interestingly, it has long been
known that the choice of an axis in an image can greatly influence the way in which shapes
are described. Figure 5 (Minsky & Papert 1972) shows an example of this. Important
medium-level vision modules like symmetry-finding (Marr 1976) can be thought of as helping
to find the axes that it is appropriate to use.


Image-space

We begin the detailed account of the computational facilities attached to each
part of the theory by discussing the image-space processor. The interest here lies in
minimizing the computational power that one uses. We require that the processor maintains
(or simulates the maintenance of) six directions. They are:

(D1) $SPASAR, which in the minimal implementation is the only vector that can be rotated.

(D2) $AXIS, which is the vector to which the $spasar is attached and around which it rotates.
These two vectors are maintained in a local space-frame, which may be thought of as being
defined by three directions:

(D3) $UP, which initially coincides with the gravitational vertical;

(D4) $FRONT, e.g. for a horse, the direction in which it is pointing; and

(D5) $HORIZONTAL, which is perpendicular to $up and $front.

Finally, because it is an important direction, we need

(D6) $VERTICAL, which is defined by the gravitational vertical.

The instruction set of the processor divides into four parts.


(P1) $spasar-$axis operations

(a) The $axis initially coincides with either $up or $front (e.g. $up for a man, $front for
a horse) and can be reset to these direction holders at any time.

(b) The $spasar can be attached to the $axis at a specified point and rotated around it.
The most important three-dimensional relationship between the $axis and $spasar is
called an adjunct relation, and it is written (p, i, g). p is the position on the $axis at
which the $spasar is attached; i is the angle of inclination of the $spasar to the $axis,
measured in the plane that contains them both; and g, the girdle-angle, describes the
rotation of the $spasar around the $axis (see figure 7). The $spasar can also be
translated away from the $axis according to certain rules. This makes it possible to
represent the fact that one's arms are not attached directly to the axis of one's torso,
but are translated away from that axis. This translation is carried out by means of an
embedding relation (d, e), where d is the distance and e the girdle-angle shown in figure
8. In the datastructures exhibited later in the article, adjunct and embedding relations
are combined in one expression, which has the form (p i g (d e)).

(c) The $axis can be re-bound to whatever the $spasar is currently bound to, and in so
doing it assumes the spatial coordinates of the $spasar.
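
As a rough illustration of operation (b), the sketch below turns an adjunct relation (p, i, g) into an attachment point and a direction for the $spasar. The conventions are assumptions made for the example rather than taken from the text: p is treated as a fraction of the $axis length, angles are in degrees, and a hypothetical unit vector `zero_dir`, perpendicular to the $axis, marks girdle-angle zero. The embedding relation is left out.

```python
import math

def cross(a, b):
    """Vector cross product of two 3-tuples."""
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def rotate_about(v, k, angle):
    """Rotate v about the unit vector k by `angle` radians (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    kv = cross(k, v)
    dot = sum(x * y for x, y in zip(k, v))
    return tuple(v[n]*c + kv[n]*s + k[n]*dot*(1 - c) for n in range(3))

def spasar_from_adjunct(axis_start, axis_dir, axis_len, zero_dir, p, i_deg, g_deg):
    """Return (attachment point, unit direction) of the $spasar.

    Assumed conventions (not from the text): p is a fraction of the axis
    length, axis_dir and zero_dir are perpendicular unit vectors, and
    zero_dir marks girdle-angle zero.
    """
    i, g = math.radians(i_deg), math.radians(g_deg)
    attach = tuple(axis_start[n] + p * axis_len * axis_dir[n] for n in range(3))
    # Incline away from the axis in the plane of axis_dir and zero_dir ...
    tilted = tuple(math.cos(i)*axis_dir[n] + math.sin(i)*zero_dir[n] for n in range(3))
    # ... then swing that direction around the axis by the girdle-angle.
    return attach, rotate_about(tilted, axis_dir, g)

attach, direction = spasar_from_adjunct(
    (0, 0, 0), (0, 0, 1), 2.0, (1, 0, 0), p=0.5, i_deg=90, g_deg=90)
print(attach, direction)
```

With a vertical $axis of length 2, the call above attaches the $spasar halfway up and points it, to within rounding, along the second coordinate axis.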

(P2) Space-frame operations

The space-frame may be rotated about any of the four directions $up, $front,
$horizontal and $vertical (see figure 6). These operations are called respectively TWIRL, SPIN,
TILT and VROTATE. In a minimal implementation, which is interesting for reasons we shall
discuss later, executing these rotations would use the same machinery that rotates the
$spasar about the $axis. Hence executing a space-frame operation would cause the current
$spasar to be lost, having to be reconstructed after the frame transformation. If the $axis is
not aligned with an axis of the space-frame, it too will have to be reconstructed. Because of
FIGURE 7. The most important 3-D relationship is called an adjunct relation, (p, i, g). The
position p, the inclination i and the girdle-angle g are defined in the text, and illustrated here.
FIGURE 9. The orientation of the internal coordinate system depends on the angle of gaze. In
order to minimize the computational power required to compute projections in the image
plane, a vector's component towards the viewer is represented separately from its component
in the tangent plane. As the angle of gaze changes, the coordinate system is rotated relative
to the outside world. If the viewer gazes straight ahead, the internal coordinate system
used is shown in the figure as (u, v, w). If the viewer moves his gaze to ascension α and
declination δ, the coordinate system is rotated -α about the w axis, and -δ about the new
direction of the u axis, resulting in the frame (x, y, z) illustrated here. In this way, no extra
computation is needed at run time to determine the projection of a vector in the image plane.
In order for this to be possible, the image-space processor requires accurate information
about the direction of the angle of gaze relative to the gravitational vertical.



this, it is nearly always advisable to set up a new space-frame when the $axis is rebound to a
new part of the 3-D model. It is worth emphasizing that in this theory, information about the
current direction of the gravitational vertical relative to the viewer plays an important role in
defining the internal representation of the spatial disposition of a 3-D model.

(P3) Computing projections

In order to compute the appearance of a vector, one has to know the viewing
angle. The natural geometry associated with an imaging system is spherical. When the image-
space processor computes the lengths and angles associated with the $axis and $spasar, it
therefore has to take account of the direction of gaze. This can be accomplished in two ways:
either an initializing pair of rotations is made, about the $up and about the $front, equal and
opposite to the ascension and declination of the viewing angle; or this transformation is
carried out on each vector just before its projection is read off.

The first method is simpler. It has the virtue that if the space-frame is actually
represented in an internal coordinate system that specifies the component in the direction
towards the viewer independently of the component in the tangent plane (see figure 9), the
required projections are available without extra computation. Our present system is
implemented this way.
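
The point about projections being available without extra computation can be made concrete with a short sketch. It assumes, purely for illustration, that each vector is stored as a triple whose third entry is the component towards the viewer; the function names are hypothetical.

```python
import math

def project(v):
    """Drop the towards-viewer component (index 2); keep the tangent-plane part."""
    return (v[0], v[1])

def appearance(axis, spasar):
    """Projected lengths of both vectors and the angle between them in degrees.

    The angle is None when either projection has zero length (vector seen end-on).
    """
    a, s = project(axis), project(spasar)
    la, ls = math.hypot(*a), math.hypot(*s)
    angle = None
    if la > 0 and ls > 0:
        cosang = (a[0]*s[0] + a[1]*s[1]) / (la * ls)
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cosang))))
    return la, ls, angle

print(appearance((3.0, 0.0, 4.0), (0.0, 2.0, 5.0)))
```

Because the viewer component is stored separately, reading off a projection is a matter of ignoring one number; the only arithmetic left is for the lengths and the angle.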

(P4) Translation

After several changes of $axis and $spasar, one can find that the vectors are
being constructed a considerable distance away from the origin of the image-space. This is
not a problem for a computer implementation of the theory, but it would have to be
considered carefully if the theory were construed as a psychological model.

The database and its interpreter

The overall picture of the datastructure that is set up for a particular image is
of twenty or thirty independent atoms, each specializing in the description of one aspect of
the image. For example, atoms might be set up for the head, torso, tail, forelegs, left-foreleg,
hock, left-front-hoof, horseshoe, neck, mane, and pair-of-ears. Other atoms describe the
shape of these items; for example, the tail may acquire the description "stick", the torso, a
"cylinder". Special atoms will describe the texture of the tail, the colours of the various parts
(the specific colours that occurred here), and the sheen on the animal's coat.

Each of these atoms names a particular type of datastructure, and contains on
its property list (or the default extension of its property-list) values and relations appropriate
to that type. For example, a torso-axis-atom has adjunct relations with the axes that connect
to it, and a pointer to an atom for the torso's shape. The shape atom, "cylinder-shape",
specifies the length and width of the cylinder, and points to any modifiers it may have - like
bumps on it, whether it is flattened, and a description of the direction and degree to which it
is conical.

After scrutinizing an image, it will be evident that the resulting datastructure
can become very large. Finding the required information in it - i.e. evaluating a reference
within it - is therefore a major problem. We approached the problem by designing a system
in which freedom of reference is a basic system facility, and by decentralizing as far as
possible the indexes that support this freedom. For example, the top-level HORSE atom may
be accessed directly if the system is asked to evaluate the reference $horse; but the
pointers to the parts of that horse are kept in a subsidiary index at the particular horse atom.
Thus in order to discover the atom that stands for this horse's tail, the atom for the horse
must be interrogated. The horse-atom therefore acts like a local index - a management
function that is superimposed upon its data-storage function - and we call such a local index a
packet. Being a packet is mandatory for physical objects (PHYSOBs), but optional for lesser
structures. Any atom can however become a packet, and it does so precisely whenever the
viewer's interest in it is sustained long enough for the pieces of description that lie
"below" that atom to become instantiated. It is also possible to create a packet that organizes
in a new way parts of a description that have already been instantiated.
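
The idea of a packet as a local index can be sketched in a few lines. The class, the `top_level` table and the atom names below are hypothetical and chosen only for illustration; the text does not describe the original datastructures at this level of detail.

```python
class Atom:
    """An atom with a property list and, once it is a packet, a local index."""

    def __init__(self, name):
        self.name = name
        self.props = {}      # property list: values and relations
        self.index = None    # local index; stays None until the atom is a packet

    def become_packet(self):
        if self.index is None:
            self.index = {}

    def add_part(self, role, atom):
        """Instantiate a piece of description 'below' this atom."""
        self.become_packet()
        self.index[role] = atom

    def lookup(self, role):
        if self.index is None:
            raise KeyError(f"{self.name} is not a packet")
        return self.index[role]

top_level = {}   # only whole objects are reachable directly

def evaluate(reference, *roles):
    """Evaluate a reference by interrogating each packet in turn."""
    atom = top_level[reference]
    for role in roles:
        atom = atom.lookup(role)
    return atom

horse = Atom("$1776")
top_level["$horse"] = horse
tail = Atom("$1777")
horse.add_part("tail", tail)
print(evaluate("$horse", "tail").name)
```

Nothing outside the horse atom knows where the tail is stored, which is the decentralization the paragraph describes: the tail can only be reached by asking the horse.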

The symbol-mapping problem

In order to construct mechanisms that would allow these things to happen, we
had to design solutions to a complex of problems that have come to be known as symbol-
mapping (Fahlman 1975, McDermott 1975). When he first introduced the term, Fahlman meant
the phenomenon whereby being told that Clyde is an elephant causes much general
knowledge about elephants to apply specifically to Clyde. Because much of the rest of
Fahlman's discussion concerned recognition, there has been some confusion about the
relevance of this phenomenon to recognition. There is no a priori reason why the two should
be connected at all. There are in fact four quite distinct problems involved (Marr 1975b), and
they are all important for systems that use stored descriptions to account for new data. The
four problems are:

fa) The one Fa hi man crigina'ly addressed, npmn y the application of general knowledge about 
elephants 10 Ihe specific inslance Clyde, which occurs when it becomes knowp that Clyde is 
an elephant, We {halt reler to this as Ihe pttpwi y-inAjerifon&o problem, because here "he 
tas^ is to map propert-es held in Ihe database onto a specific instance of a known- class. 

<h) Suppose that £lyu* the elephant has already open mentioned in Ihe current environment. 
Given only the reference AriJMAL,Or LARGE GRAY OBJECT, or PEANUT-EATER, use lh«s Sb find 
Clyde ml hi database, thal* tail this the TtftfCHtt-viinAoM pr&b em, since in some cense 
the issue t how Wide to mane Ihe window through which you can access a current item by 
describing Its properties. 

(c) The reference-window problem merges into the third problem, that of indexing for
recognition, but there is a distinction that is probably worth preserving. In the reference-
window problem, the items that are to be referenced are already present as instances in the
current environment. In recognition, the problem is to access a suitable template from the
database with which to describe some incoming information. Recognition results in an instance
of some template, whose own reference-window problem then begins.

(d) The reference-window problem meets the recognition problem somewhere near where
they both turn into the fourth problem, that of recall. The difference between the
recall problem and the reference-window problem is that in the recall problem, the item to be
accessed has not yet been brought into the current environment. The difference between it
and the recognition problem is that one is not simply recognizing yonder ponderous gray
object as an elephant: it is a specific elephant, namely Clyde, the one who ate your bag of
peanuts last Thursday.

FIGURE 10. Our internal representation of 3-D shapes is maintained on the property-lists of
two kinds of name: templates and $atoms. Templates (upper case names like MONKEY)
store information about archetype shapes while $atoms (names of the form $1776) are used to
represent particular instances of the templates. A third kind of name is the $reference
(template names prefixed with a $), which appears as the value of a property in the
property-list of a template. These are pointers that indicate that one should first see if the
particular instance has been instantiated in the current environment before falling through to
the indicated template.

FIGURE 11. This figure illustrates the processes involved in asking for the thickness of the
torso, $1876. Information that will be used to answer this request is shown in (1), which is
part of the environment at the time of the request. (2) shows the steps taken to answer the
request and (3) shows the effect this process has on the environment.


Some definitions

Only problems (a), (b) and (c) will concern us in this article, but before we can
explain our solutions to them, we need to introduce the following definitions:

(1) Templates

A template, denoted by an upper case name like HORSE, is an archetype
property-list. Values in this property-list are expressed as $references (definition (3) below),
not as particular instances.

(2) $atom

A $atom, e.g., $1776, represents an instance of a template, and is created by a
process called instantiation. A $atom has a property-list, and the template from which a $atom
is derived is signalled by the $CLASS property of the $atom. Property names prefixed by a $,
like $CLASS, cause their values to be quoted rather than evaluated. It is roughly true that
values not specified on the $atom's property-list default to values on the $atom's template
(Raphael 1968 p. 50), but what actually happens differs in an important way from a simple
default. We explain this below.

(3) $reference

A $reference is a name, prefixed with a $, which may be evaluated (by $EVAL)
in the current environment. A $reference evaluation is successful if it returns a $atom.
Typical $references are $ANIMAL, $HORSE, $SHAPE, $COLOR. References of the form
($reference1 . $reference2) are also allowed. In such cases, the result of evaluating
$reference2 helps to define the context in which $reference1 should be evaluated (e.g.,
($torso . $horse) or ($bump . ($neck . $horse))).

(4) Packet

A packet is a $atom that contains a local index, usually for some of the
substructures of the object that the $atom represents. For example, if $1776 has $CLASS
HORSE, one of the entries in its index might be <TORSO $1876>, meaning that the torso of this
particular horse is represented at the top level by the $atom $1876. Any $atom can
become a packet if the items that it organizes are instantiated.

(5) $index

The $index is a general reference index. One can address it with an evaluable
reference, and if an entry exists it will return another reference which will help in evaluating
the original. The distinction between the $index and the local packet indexes is that the
$index is a permanent facility of general knowledge (like a dictionary), whereas each
packet is always specific [...]. Because the $index is permanent, it can be allowed to grow
very elaborate. At present, we hold $index entries under the templates they concern. A $atom
representing a specific instance may be thought of as being plugged in to the general
knowledge contained in the $index. The plugs are the entries in the $atom's organizing
super-packets, because it is through these that a reference returned by the $index can
evaluate to that $atom.

(6) $reference evaluation

When a $reference is made, it is evaluated by the function $EVAL, which is
designed in accordance with the principle of least commitment. If a reference is made within a
context that has a unique referent, that referent is returned. For example, if an item on
the $torso $atom of a horse refers to $tail, the reference evaluates to that horse's tail. This
evaluation would be unaffected by the presence of an ox elsewhere in the image. If however
the reference was made externally (for example by part of the datastructure for a soup
recipe), the $tail reference would evaluate to an ambiguous pair. Other knowledge would have
to be deployed to resolve the ambiguity. $EVAL makes extensive use of the information in the
current environment and in the $index.
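
The least-commitment behaviour of $EVAL can be illustrated by a small sketch. It is not the
original implementation: the dictionary fields ("class", "owner"), the function name
eval_reference and the example environment are all assumptions made for illustration. The
point shown is only that an evaluation returns every candidate rather than choosing one.

```python
def eval_reference(name, environment, context=None):
    """Return every instantiated item that `name` could denote.

    Least commitment: nothing is chosen arbitrarily. A unique referent
    comes back as a one-element list, an ambiguity as a longer list.
    """
    candidates = [item for item in environment if item["class"] == name]
    if context is not None:
        # A reference made inside a packet only sees that packet's items.
        candidates = [item for item in candidates if item["owner"] == context]
    return candidates


# Hypothetical environment: a horse and an ox are both present.
environment = [
    {"id": "horse-tail", "class": "TAIL", "owner": "horse"},
    {"id": "ox-tail", "class": "TAIL", "owner": "ox"},
]

inside = eval_reference("TAIL", environment, context="horse")   # unique referent
outside = eval_reference("TAIL", environment)                   # ambiguous pair
```

Made from within the horse, the reference has one referent; made externally, it yields both
tails, and some other knowledge must resolve the ambiguity.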

Examples of the datastructures defined here appear in figure 10.

The property-inheritance problem

We stated above that values that are not specified by the property-list of a
$atom default to the values on the property-list of the $atom's template. What actually
happens differs in an important way from this. The values that occur in the property-list of a
template are (or contain) $references, not specific $atoms. References cannot be used other
than as references; they must be replaced by specific instances, i.e. by $atoms, before the
associated template's property-list can be examined. Hence if a default to a $reference
occurs, the system evaluates the reference if it can, and places the result in the property-
list of the $atom. If the reference has no referent in the current environment, one
is instantiated.


An example will help to clarify this. Suppose that $1876 has $CLASS TORSO,
and that the SHAPE held under the TORSO template is the reference $TORSO-SHAPE. When
we evaluate the SHAPE property of $1876, the evaluator falls through to the template and
encounters this $reference. If a $atom for this torso's shape already existed, it would have
been indexed under $1876; since none is, a new $atom ($1976 say) is instantiated from the
TORSO-SHAPE template, and $1976 is spliced in as the SHAPE property of $1876. $1976 stands
for the particular shape of this particular torso, although as it is new, $1976 consists mostly
of defaults to its template. The critical ideas here are (a) that one can discuss only
instantiated items and not references, so an unsuccessful reference evaluation results in an
instantiation; and (b) that $1976 must be indexed in the $atom $1876 under the entry SHAPE,
not TORSO-SHAPE (although it may be indexed under TORSO-SHAPE as well). This is so that
other $atoms can refer to it through the reference $SHAPE; they do not have to know that it
is a particular TORSO-SHAPE. One important way in which templates can grow more
specialized is by replacing general references like SHAPE with particular ones (like TORSO-
SHAPE), but this is a much later issue. The act of instantiation is a common one, and tens of
$atoms are created to describe even a simple image. The abundance of $atoms is one reason
for the importance of the reference-window problem.
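
The fall-through from a $atom to its template, followed by instantiation, might be sketched
as below. This is only an illustration under stated assumptions: $atoms are modelled as
Python dictionaries, the names TEMPLATES, instantiate and get_property are hypothetical,
and the search of the wider environment for an existing referent is reduced to a check of
the $atom's own property-list.

```python
import itertools

# Hypothetical templates: property values are $references, never instances.
TEMPLATES = {
    "TORSO": {"SHAPE": "$TORSO-SHAPE"},
    "TORSO-SHAPE": {},
}
_serial = itertools.count(1)


def instantiate(template):
    """Create a new $atom (here a dict) whose $CLASS names its template."""
    return {"$NAME": f"$atom-{next(_serial)}", "$CLASS": template}


def get_property(atom, prop):
    """Look up `prop`, instantiating the template's $reference if needed."""
    if prop in atom:                      # already instantiated and spliced in
        return atom[prop]
    reference = TEMPLATES[atom["$CLASS"]].get(prop)
    if reference is None:                 # the template says nothing either
        return None
    new_atom = instantiate(reference.lstrip("$"))
    atom[prop] = new_atom                 # indexed under SHAPE, not TORSO-SHAPE
    return new_atom


torso = instantiate("TORSO")
shape = get_property(torso, "SHAPE")      # causes an instantiation
```

A second request for the SHAPE of the same torso returns the same $atom, because the first
request spliced it into the torso's property-list under the general entry SHAPE.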

But suppose one asks for some property of $1876 that is not specified on the
TORSO template, but which is specified somewhere. For example, we might ask what is the
thickness of the torso? This information will not be found under TORSO or $1876 because it
concerns the SHAPE of the TORSO, and so is one step removed from information that is
directly accessible through $1876. This is the difficult part of the property-inheritance
problem, and it was not addressed by Raphael (1968). Rather than trying to design a
universal solution to this problem like Fahlman (1975), we took the view that the only thing
needed to solve it is knowledge about where the required information may be found. This
knowledge is held in the $index, as a specification of which packets can organize (in this
example) THICKNESS. The $index returns the reference $SHAPE, which means that in order to
discover the thickness of $1876 we have to interrogate its SHAPE $atom. If this exists, it will
already be indexed under $1876's SHAPE entry. If it does not already exist, one is
instantiated. This causes a second call to the $index to discover whether a special shape
template exists for the shape of a torso. It does, and the $index returns TORSO-SHAPE, which
is the specialist's internal name. This is instantiated, and the required information is then
found. The important points here are (a) the use of the $index to guide the search round the
database (like PLANNER theorems); and (b) once again, that using information from the $index
involves a $reference evaluation which, if unsuccessful, can cause an instantiation.
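
The two calls to the $index in the thickness example can be sketched as follows. The sketch
is a simplification under assumptions, not the system described: the two lookup tables
ORGANIZED_BY and SPECIALIST stand in for the $index, the thickness value 0.4 is invented,
and THICKNESS is stored on the template as a plain number purely to keep the example short.

```python
# Hypothetical templates. THICKNESS is stored as a plain number only to keep
# the sketch short; in the text, template values are $references.
TEMPLATES = {
    "TORSO": {},
    "TORSO-SHAPE": {"THICKNESS": 0.4},
}
# Two hypothetical $index tables: which property organizes a requested one,
# and which specialist template serves a (property, class) pair.
ORGANIZED_BY = {"THICKNESS": "SHAPE"}
SPECIALIST = {("SHAPE", "TORSO"): "TORSO-SHAPE"}


def instantiate(template):
    return {"$CLASS": template}


def lookup(atom, prop):
    if prop in atom:
        return atom[prop]
    template = TEMPLATES[atom["$CLASS"]]
    if prop in template:
        return template[prop]
    organizer = ORGANIZED_BY.get(prop)     # first call to the $index
    if organizer is None:
        return None
    if organizer not in atom:              # second call: is there a specialist?
        special = SPECIALIST.get((organizer, atom["$CLASS"]))
        if special is None:
            return None
        atom[organizer] = instantiate(special)
    return lookup(atom[organizer], prop)


torso = instantiate("TORSO")
thickness = lookup(torso, "THICKNESS")
```

After the request, the torso carries a SHAPE entry that did not exist before, which
corresponds to the change in the environment shown in part (3) of figure 11.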

The reference-window problem

The reference-window problem is important because the mechanisms involved
here are what make it possible to apply one template (or scenario) to many different
instances. For example, a HORSE template will contain references to $TORSO, which in any
instance of that template need to evaluate to the particular $atom for that horse's torso.
The scenario "VIRGIN SACRIFICED TO FEARSOME-THING" will have internal references like
$VIRGIN and $FEARSOME-THING, in terms of which information in the scenario - the motive of
appeasement, and the secluded ingestion of one party by the other - is expressed. In a
particular instance, the general statements contained in the scenario must be transformed into
specific assertions about Mary-Jane and Godzilla; $VIRGIN must evaluate to Mary-Jane, and
$FEARSOME-THING must evaluate to Godzilla.

The reference-window problem is almost the inverse of the property-
inheritance problem, and to implement it requires extra indexing. To solve it we (a)
decentralised the necessary indexing by distributing it among existing $atoms (this is what
turns them into packets); (b) added entries to the $index that help one find the local index
appropriate for a given reference; and (c) added extra top-level access-points where they
proved useful. We have already seen several examples of (a) and (b), and an example of (c) is
that when a HORSE template is instantiated, its $atom is attached to the $index by the top-
level references $HORSE and $ANIMAL. The reason for this is that much knowledge about a
HORSE is best represented as knowledge about ANIMALs, for reasons of economy.

How packets are created

As we mentioned earlier, the 3-D representations that we use are loosely
hierarchical. The hierarchy (such as there is) is expressed by the local indexing structure.
It comes about because $atoms can often cause the instantiation of other $atoms that they
go on to index. This can happen in several ways. One is to ask the $index for a list of
the PARTS of a 3-D model. It returns a list of templates: the PARTS of an ANIMAL consist
of HEAD, NECK, TORSO, FORELEGS, HINDLEGS and TAIL. The process of ACTIVATION is
defined as the instantiation of the PARTS of a $atom. It occurs whenever something important
happens to that $atom - for example if the $atom is used to set up a space-frame in the
image-space processor - and it may be thought of as the lazy man's way of instantiating a
3-D model.

The instantiation sequence is not restricted to using the PARTS list; in special
circumstances, the processor can avoid the PARTS recipe altogether, instantiating only what is
required by current needs, for example by spatially driven index accesses to what is at the
front of the torso or its rear, etc. Thus it can happen that different parts of an animal are
currently represented at quite different levels of detail. If the front left hoof is being
examined, there is no reason why the hind legs should have been unpacked beyond a coarse
HINDLEGS descriptor. The hierarchy in 3-D model organization is not strict, and the local
indexing is not even fixed. For example, one can move straight from the torso to the left
foreleg; it is not necessary to pass through the pair-of-forelegs datastructure. The
particular route that one chooses and the amount of detail for which one instantiates
descriptors depend upon the purpose for which the animal is being viewed.

The reverse situation also occurs; that is, instantiating a given $atom can
cause the instantiation of a superior one that then indexes the original. This is so important
that it always occurs unless a suitable superior already exists, or unless the $atom is itself a
SOB (which has a top-level status). Indexing is important because if an item is not
plugged into the $index somewhere, all references to it will fail, and the item will essentially
have been lost forever. For example, if a TAIL template is instantiated, it will cause the
instantiation of an ANIMAL $atom which then attaches itself to the $index, and through which
references (like $TAIL or $ANIMAL) may be evaluated successfully. In this way a
particular tail becomes connected to the general knowledge about animals and animal tails that
is held on the $index. It is interesting how a strong physical property like cohesiveness comes
to be reflected in the datastructure as an indexing strategy.


Interfacing with the image-space processor

Before we discuss the third symbol-mapping problem, that of indexing for
recognition, we need an understanding of how the pieces of machinery described so far work
together. Let us therefore assume that a 3-D model has already been selected, and observe
how it interfaces with the image-space processor.

FIGURE 12. The $axis and $spasar are used to interpret an adjunct relation within a space
frame. In the figures above, the $axis is maintained in the monkey's torso while the $spasar
is moved onto the various adjuncts of the torso. In (a), the $spasar is bound to the axis of
the monkey's head. In (b) and (c) it is bound to axes for each arm; in (c) and (d), to axes for
the legs, and in (e), to the tail. We see how each pair would appear from a particular
orientation.

FIGURE 13. The monkey shown in several space frame orientations. Each image is produced
by mapping out all of the monkey's instantiated axes while keeping the space frame fixed.
For example, the monkey's nose is found by starting with the $axis on the monkey axis and
locating the torso axis with the $spasar. Then the $axis can be set to the $spasar's position,
fixing it on the torso, and the neck can be located by the $spasar using the torso-neck
adjunct relation. This leap frog process continues until the $spasar is on the monkey's nose.
When the space frame is rotated, we must reconstruct the entire image. An important
corollary to this is that we only construct those details we need and those only when they
are needed. In (a), (b) and (c) a standing monkey is rotated about $vertical in 45 degree
steps. In (d), (e) and (f) the monkey is first tilted backward 45 degrees and then rotated as
above.

Suppose that the commands (INSTANTIATE MONKEY) and (ENTER-IMAGE-SPACE
$MONKEY) are executed. The effect is to instantiate a $atom from the monkey template, to
activate it (which instantiates the major parts of a monkey), and to set up a space-frame for
the monkey. The $axis is bound to the monkey $atom, and the $spasar is not yet assigned.

By executing (CHANGE-$AXIS $TORSO), and then interrogating the adjunct
relations between the torso and the animal's limbs, the $spasar may be placed in any of the
positions shown in figure 12. By rotating the space-frame about the four allowed axes, the
stick-monkey may be rotated to any orientation (figure 13).

It is important to remember that in practice, only two vectors (e.g. the $torso
and the $tail) are represented at any one moment. When the $spasar is rebound to another
limb, record of the predicted appearance of the previous one disappears, and only the adjunct
relation that was read off the $spasar remains in the 3-D model. The IMAGEL property of the
tail's $atom will still point to the image element in question, and if that moves, flags will have
to be set to warn that the former adjunct relation may now be invalid. We can reasonably
suppose that in real life, IMAGEL bindings into a 3-D model can be maintained even while the
$spasar is bound elsewhere, because relatively low-level tracking algorithms suffice to follow
a moving item in an image (Chien & Jones 1975, Speckert 1975).


Indexing for recognition

We have seen how the theory's mechanisms run in an isolated state, and we
turn now to the relation between those mechanisms and the image during recognition. The
interesting point raised by pipe-cleaner animals (figure 1) in the context of the present
theory is that one has to use a 3-D description from the database before the image-space
processor can be run; but one might think that once a 3-D description has been selected,
recognition has in some sense already taken place. In a full image, one might argue that
other clues suffice to select the appropriate 3-D model, but in the pipe-cleaner model there
are no other clues. The traditional A.I. answer to this dilemma is to hypothesise and test,
using error information in some kind of "difference-directed memory" to move to a better
hypothesis (Minsky 1975, Fahlman 1975). This strategy violates the principle of least commitment
on which much of our vision system is built. Fortunately, it is possible even at this late stage
to design the diagnostic system so that it obeys the principle of least commitment. This may
be accomplished by using a "general" animal 3-D model (possibly several - large, medium and
small animal), whose axes are only roughly correct, and which either has no functional
semantics pointer itself, or only a weak one. Using a general model, good estimates can
usually be made of the actual lengths, inclinations and girdle-angles present. By accessing the
index using this new information, the correct 3-D model and its functional semantics pointer
can then be recovered. When the correct 3-D model is eventually found, the only visible
change to the top-level organizing $atom occurs in the value of its $CLASS property. In some
sense this change constitutes the act of recognition, because after it has occurred, references
will automatically evaluate into the "recognized" template.

Figure 2 showed a typical datastructure that might have been delivered by the
lower parts of the vision system. It consists of a set of axes, computed by segmenting a form
(Marr & Vatan, to appear), bound to each of which is a property list (possibly null). This
property-list may for example describe the shape of the generalized cylinder whose axis this
is, and possibly some descriptors of its surface texture. In a pipe-cleaner animal these
properties are irrelevant. The first problem is to access an appropriate 3-D model from the
database. There are various parameters by which these models may be indexed.
Connectivity is not destroyed by perspective transformations, nor are numbers like the
fractional distance down one axis at which another axis connects to it. Spurious connectivities
can of course be introduced if one axis crosses in front of another, and if the reason is not
recognised lower down; but existing connections cannot be destroyed, only obscured. Hence
in order to use the connectivity information, when measuring which database items best match
a given configuration set, unexplained errors of omission are treated much more seriously
than unexplained errors of commission.
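
One way to picture this asymmetry is a mismatch score in which a model connection that is
missing from the image costs more than an image connection that is absent from the model.
The sketch below is an assumption-laden illustration only: the function name mismatch and
the two weights are invented, and the text does not say how the system actually scores a
match.

```python
def mismatch(model_joins, image_joins, omission_weight=5.0, commission_weight=1.0):
    """Score how badly an image's connectivity fits a model's (lower is better)."""
    model, image = set(model_joins), set(image_joins)
    omitted = model - image     # connections the model requires but the image lacks
    spurious = image - model    # extra connections, e.g. one axis crossing another
    return omission_weight * len(omitted) + commission_weight * len(spurious)


# Hypothetical stick-animal connectivity.
animal = {("torso", "neck"), ("torso", "tail"), ("torso", "leg")}
with_crossing = animal | {("leg", "tail")}      # one spurious connection
without_tail = animal - {("torso", "tail")}     # one unexplained omission
```

With these weights an image containing an accidental crossing still matches the animal model
better than one in which a required connection is unaccountably missing.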

The second sort of information is girdle-angles, inclinations, and the relative
lengths of axes. It is easier to take advantage of these later on, when the image-space
processor has delivered at least partial results about the three-dimensional orientation
relative to the viewer. But it is possible to do something with them early on. This comes
about through weak, gross clues. For example, "verticals" in the image are often close to
verticals in real life, and if the apparent length of a "neck" exceeds the apparent length of a
leg, and if both are quite large, the image is likely to be a giraffe. In other words, lower
bounds can often be inferred, and are sometimes useful. Another important type of clue
concerns major differences in the girdle-angles of two axes that are connected to a common
one. For example, the neck and the tail often point in very different directions - one up and
one down - and this obvious difference can be seen without a sophisticated 3-D analysis. In
a pipe-cleaner animal, this very rough difference can help to determine which end of the
animal is which.

Bearing these considerations in mind, we see that indexing clues can be divided
into two kinds: those that can be used before the image-space processor has been called into
play, and finer clues that require at least a preliminary guess at the 3-D configuration before
they become sufficiently reliable. The former category includes connectivity, fractional
lengths on one axis, some comparisons between the lengths of different axes, very rough
relative girdle-angle comparisons between two axes that connect to a common one, texture
and rough shape information, and possibly general information like the number of axes that
are probably horizontal, vertical, or neither. Such information in fact provides a surprisingly
rich body with which to go to the indexer, and this is part of the reason for our opinion that
straightforward recognition (recovery of the functional semantics pointer) is considerably
over-determined in a natural system. The second category includes much finer indexing on
the relative lengths of different axes and their 3-D relation to one another.

We can now follow the course of the analysis that takes place when the system
is presented with the datastructure shown in figure 2. Firstly, the connectivity of the axes is
discovered, and coded ready for accessing the indexer. Nevatia (1974) also described
accessing an index using such connectivity information. The connectivity and the distance
down each axis at which the other connects to it are both used. Each join is characterized
using an overlapping hash-bucket system that makes adequate allowance for errors of
measurement. At the same time, the axis (or axes) with the most connections are noted, and
they form a possibilities list for the principal axis of the structure. In this example, there is
no ambiguity, because the torso axis has many more connections than any other.
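
The two steps just described can be sketched as follows. The sketch rests on assumptions
that the text does not settle: the bucket width of 0.2, the use of two grids offset by half a
bucket as the form of overlap, and the names overlapping_buckets, same_join and
principal_axis_candidates are all invented for illustration.

```python
from collections import Counter


def overlapping_buckets(fraction, width=0.2):
    """Return two bucket keys for a fractional position along an axis.

    One key comes from a plain grid and one from the same grid shifted by
    half a bucket, so two nearby measurements always share at least one key.
    """
    primary = int(fraction // width)
    shifted = int((fraction + width / 2) // width)
    return {("a", primary), ("b", shifted)}


def same_join(f1, f2, width=0.2):
    """True when two measured join positions fall in a common bucket."""
    return bool(overlapping_buckets(f1, width) & overlapping_buckets(f2, width))


def principal_axis_candidates(joins):
    """Return the axis (or axes) taking part in the most connections."""
    counts = Counter()
    for axis_a, axis_b, _fraction in joins:
        counts[axis_a] += 1
        counts[axis_b] += 1
    most = max(counts.values())
    return sorted(axis for axis, n in counts.items() if n == most)


# Hypothetical stick animal: (axis, adjunct, fraction down the axis).
joins = [
    ("torso", "neck", 0.0),
    ("torso", "foreleg", 0.1),
    ("torso", "hindleg", 0.9),
    ("torso", "tail", 1.0),
    ("neck", "head", 1.0),
]
candidates = principal_axis_candidates(joins)
```

Measurements of 0.39 and 0.41 straddle a boundary of the plain grid but still share a bucket
in the shifted one, which is the allowance for measurement error that the overlap provides.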

The call to the indexer returns a general ANIMAL 3-D model, together with a
list of several specific animals. In real life, it is likely that auxiliary information would suffice
to narrow this set of possibilities down to a single candidate, but here we take the more
difficult course of using the general ANIMAL model. This 3-D model is instantiated and
activated, and the $atom for the torso is identified with the principal axis in the image. This
identification is not yet complete, because we have yet to resolve its polarity (i.e., which end
of the animal corresponds to the head, and which to the tail).

The polarity of the torso is over-determined in this example. It could be
recovered from the tail-down and neck-up combination; or from the rough shape-descriptors
that were happily included in the problem statement. Once the polarity of the torso is
determined, the system proceeds to re-express the image elements in the form (origin,
increment), rather than the form (end1, end2) that was originally given, because that is the
form needed for matching with the image-space processor, and because it now has enough
information to be able to do this correctly. The image elements are then bound into the
IMAGEL properties of the $atoms for the remaining parts of the ANIMAL. This binding
includes backpointers, so that (for example) were one leg to move and this fact noticed by
low-level routines, such routines could interrupt the higher structures directly rather than by
having to initiate a top-down search for the IMAGEL that changed.
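
The change of form from (end1, end2) to (origin, increment) is simple once the polarity is
known. A minimal sketch, assuming 2-D image coordinates held as tuples and a hypothetical
flag origin_is_end1 that records the polarity decision:

```python
def to_origin_increment(end1, end2, origin_is_end1=True):
    """Re-express an image element as an origin plus an increment vector."""
    if not origin_is_end1:
        end1, end2 = end2, end1           # polarity says the axis starts at end2
    increment = tuple(b - a for a, b in zip(end1, end2))
    return end1, increment


forward = to_origin_increment((1, 2), (4, 6))
reverse = to_origin_increment((1, 2), (4, 6), origin_is_end1=False)
```

The same pair of end points gives opposite increments under the two polarities, which is why
the conversion has to wait until the polarity of the torso has been resolved.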

The result of this process is the datastructure shown in figure 14. The system
now instantiates a space-frame for the animal, and is ready to commence the relaxation
process that will result in the representation of its three-dimensional disposition.

Unusual views

In real life, initial access to a suitable 3-D model may be more difficult than this,
because the axes that emerge from segmenting a two-dimensional form can differ in an
important way from the axes that are natural for the 3-D model. One circumstance in which
this can happen is when an important axis of an object points directly towards the viewer.
For example, the side view of a bucket segments naturally into a generalized cylinder
description in which the bucket is represented as a slice of a cone, and the axis is vertical. If
one looks at the bucket from above, one essentially sees two circles joined by the sloping
sides. The principal axis of the bucket appears as a point from this perspective. The same
phenomenon is exhibited by the image of a long, thin cone whose axis points nearly directly
away from the viewer. Also, when a viewed object is very close, peculiar distortions can
occur in the relative sizes in the image of its parts.

In order to access the correct 3-D model despite these obfuscations, some idea
of depth has to be introduced into the analysis before addressing the 3-D model index can be
successful. In the case of the bucket example, some process has to realize that the two
circles might be separated in depth, and that if they are, they could be separated by a
considerable distance. The clues that signal this are often nuances of shadow and highlight,
and this leads us to expect that much of the analysis of lighting and shadow can influence the
processing at exactly this stage of recognition. We think of the computations that take place
here as deploying the $spasar to construct from the image a primary 3-D model that consists
at first of an axis in depth whose circumscribing surface is bounded by the two visible circles,
and to which extra details - like hollowness, the closure of one end of this surface by an
orthogonal plane, and possibly the addition of a cross-strut to account for the handle - are
added. At some point during the construction of this description, the indexer is successful at
finding a match with a bucket 3-D model in the database. We do not at present understand
this any more precisely, but we have the feeling that one might have to abandon the principle
of least commitment here in favour of some kind of hypothesize-and-test strategy. If an
"unusual view" becomes a common view, it would become profitable to index the appropriate
3-D model under the special features that obtain for that view.

Interestingly, Warrington & Taylor (1973) found that patients with lesions in
the right parietal lobe were greatly impaired when confronted with unusual views of objects.
Such patients, who can recognize the picture of a bucket in side view, are unable to recognise
the same bucket when viewed from above; and even deny that the latter could be a bucket
when informed that it is. The authors commented that unconventional lighting was as effective
as an unusual perspective in disturbing the performance of such patients, and they suggested
that this arises because the more straightforward 2-D features are absent in these situations.
These findings were recently confirmed by Carey (personal communication), who agreed that
"unusual views" usually correspond to views where an important axis is foreshortened.


Relaxation

With the image elements properly bound to a 3-D model, we can begin the
relaxation process. This is an incremental activity of adjusting the $axis' and $spasar's
orientations, guided by constraints contributed by the image, the 3-D model, and external
influences such as gravity. The object is to find the correct 3-D orientation which will allow
us to calculate the true lengths of the axes and their relative dispositions in space. This
information can then be used to access a more specific 3-D interpretation of the image.

The constraints we have to work with are varied, and we have been finding
new ones quite regularly. Most of them contribute information that restricts the disposition of
one axis relative to another, sometimes in rather complex ways. The major problem in the
relaxation stage seems to be in combining these incomplete clues to deduce the 3-D model's
actual orientation. The image-space processor is currently the major focus in the relaxation
theory that is developing around this problem. We feel that this processor can be used
effectively as a dynamic model of space, where the $axis and $spasar are used much as a child
plays with blocks to see how they can go together given the additional constraint he wants
to impose on the configuration. In our case, the $axis and $spasar are held together as rigidly
as is necessary to maintain whatever adjunct information we have available, and the whole
configuration is pushed around until its projection onto the image plane most closely matches
the one we are trying to interpret. Knowledge of these constraints and of methods for
applying them will be represented procedurally in the relaxation processor currently being
developed. In the discussion that follows, we outline some of the orientation constraints
available to the relaxation processor and how we plan to apply them.

The first constraints on an axis' orientation are calculated directly from the
image. The imagel gives us the orientation and length of its projection onto the image plane,
so the only additional information needed to compute the axis' orientation is the inclination
this axis makes with the viewer's line of sight. Sometimes this angle can be estimated directly
from the imagel's shape description. For example if the axis is known to be cylindrical and the
intermediate visual processor has supplied a shape description sufficient to calculate the
location of its perspective vanishing point, the axis will be parallel with a line from the viewer
to this vanishing point. More often, the imagel's shape property and our confidence in what it
should be will not allow more than a very approximate guess at the vanishing point's location,
so other clues are needed to constrain the axis further.
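
The geometry of this first constraint can be made concrete with a short sketch. It assumes an
orthographic projection with the line of sight along the z axis, which is a simplification of
the perspective case discussed in the text; the function name axis_from_imagel and its
parameters are hypothetical. Given the imagel's orientation in the image plane, its projected
length, and the inclination to the line of sight, the axis direction and true length follow.

```python
import math


def axis_from_imagel(image_angle, projected_length, inclination):
    """Recover a unit 3-D direction and the true length of an axis.

    image_angle      orientation of the imagel in the image plane (radians)
    projected_length length of the imagel in the image plane
    inclination      angle between the axis and the line of sight (radians)
    Assumes orthographic projection along z.
    """
    s = math.sin(inclination)
    if abs(s) < 1e-9:
        raise ValueError("axis lies along the line of sight; it projects to a point")
    direction = (
        s * math.cos(image_angle),
        s * math.sin(image_angle),
        math.cos(inclination),
    )
    true_length = projected_length / s    # foreshortening is a factor sin(inclination)
    return direction, true_length


direction, length = axis_from_imagel(0.0, 1.0, math.pi / 6)
```

An axis inclined at 30 degrees to the line of sight is foreshortened by a factor of two, so
an imagel of unit length corresponds to an axis about two units long.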

The influence that gravity and other external factors have on the distribution of
imagel orientations provides additional information about individual imagels. For example if an
imagel is close to vertical in the image plane, there is a high probability that the
corresponding axis is vertical in space as well. This clue does not require knowing anything
about the nature of the observed axis to be used, but it attracts most confidence when the
axis is known to be vertical in its normal state (e.g. a horse's leg). Another clue has to do
with the ground. It is often flat, and when it supports the objects we view, it serves as a
very strong constraint on the likely orientations of some of their axes. Consider for example
a cow's torso: it is very difficult for the cow to hold it out of the ground plane without
bending its legs out of parallel.

When individual imagels fail to provide enough information, we can look at
several together. In the example with the cow, it was important that its legs were parallel for
insuring that his torso was parallel to the ground. Parallelism is a very nice property in that
the imagels that correspond to parallel axes are also parallel. In animals, the legs are quite
often close to parallel and this is a powerful means of disambiguating them from other
adjuncts of the torso. Accidental alignments can produce parallel imagels where the axes are
not parallel, but rarely will this happen for more than two axes at a time, and even this
possibility is infrequent enough to justify paying attention to any parallel axes that are
present in the image.
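
As a rough illustration of how the parallelism clue might be applied, the sketch below groups
imagels whose image-plane orientations agree to within a tolerance. The representation of an
imagel as a pair of end points, the 5 degree tolerance and the function names are all
assumptions made for the example.

```python
import math

def imagel_orientation(p, q):
    """Orientation of the imagel from p to q in degrees, modulo 180."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0])) % 180.0

def parallel_groups(imagels, tol_deg=5.0, min_size=2):
    """Group imagels that are parallel in the image to within tol_deg.

    imagels maps a name to a pair of end points ((x0, y0), (x1, y1)).
    Grouping is greedy: each imagel is compared with the first member
    of every existing group.
    """
    groups = []
    for name, (p, q) in imagels.items():
        angle = imagel_orientation(p, q)
        for group in groups:
            diff = abs(angle - group["angle"])
            # Orientations wrap around at 180 degrees.
            if min(diff, 180.0 - diff) <= tol_deg:
                group["members"].append(name)
                break
        else:
            groups.append({"angle": angle, "members": [name]})
    return [g["members"] for g in groups if len(g["members"]) >= min_size]
```

A group of three or more parallel imagels is the stronger cue, since accidental alignments
rarely involve more than two axes. Calling the function with `min_size=3` keeps only such groups.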


When one axis has been fixed in space, the dispositions of its adjunct axes are
heavily constrained and are often determined uniquely. It is a simple matter to rotate the
$spasar about the known axis until its projection matches the imagel of the unknown axis.
This is what was done above when the animal's torso was rotated about the vertical until its
projection matched the torso imagel.
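
The rotation step can be sketched as a one-parameter search. The code below is only an
illustration under stated assumptions: orthographic projection onto the x-y plane, a
brute-force scan over rotation angles, and hypothetical function names. It is not the
relaxation processor itself.

```python
import math

def rotate_about(v, k, angle):
    """Rotate vector v about the unit vector k by angle radians (Rodrigues' formula)."""
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(a * b for a, b in zip(k, v))
    cross = (k[1] * v[2] - k[2] * v[1],
             k[2] * v[0] - k[0] * v[2],
             k[0] * v[1] - k[1] * v[0])
    return tuple(v[i] * c + cross[i] * s + k[i] * dot * (1.0 - c) for i in range(3))

def projected_angle(v):
    """Direction, in radians, of the orthographic projection of v onto the image plane."""
    return math.atan2(v[1], v[0])

def fit_rotation(spasar, axis, imagel_angle, steps=3600):
    """Scan rotations of spasar about axis and return the angle (radians)
    whose projection best matches the directed imagel orientation."""
    best_err, best_theta = None, 0.0
    for n in range(steps):
        theta = 2.0 * math.pi * n / steps
        p = projected_angle(rotate_about(spasar, axis, theta))
        # Smallest absolute difference between the two directions.
        err = abs(math.atan2(math.sin(p - imagel_angle), math.cos(p - imagel_angle)))
        if best_err is None or err < best_err:
            best_err, best_theta = err, theta
    return best_theta
```

Because the imagel orientation is treated as directed (the free end of the $spasar is required
to lie on a particular side), the match is unique whenever the known axis is tilted out of the
image plane. If the known axis lies in the image plane, the projected direction no longer
varies with the rotation and the projected length would have to be used instead.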

Finally, if no axis is determined but two adjunct relations are known between
three nonparallel axes, then (except for a few pathological cases) the dispositions of these
axes are determined. The image-space processor can be used to discover these dispositions
through a process of experimenting with various orientations of these three axes (two at a
time), attempting to maintain the adjunct relations while minimizing the discrepancy between
the imagels and the projected axes.

Let us now return to the animal image we have been processing. It was left at
the point where its 3-D model had been activated and entered into the image-space with the
imagels correctly bound to its axes. It has no vertical axes, but it is known to be on the
ground and its torso is very likely to be horizontal. The imagel shape information is not good
enough to calculate any vanishing points, and only the adjuncts between the torso and the
legs are reasonably certain since the others vary too much from one animal to another. First
the $axis is set to $vertical and the $spasar is placed on the animal's torso adjunct. Thus the
$spasar is forced to be perpendicular to the gravitational vertical. The $spasar's projection
must be aligned with the torso imagel so that the free end of the $spasar is on the tail side of
the imagel. Once this is done, the $spasar is oriented as the animal's torso is. Next the animal
3-D model must be rotated about its torso axis so that it corresponds with the image. The
torso-leg adjunct is the most reliable, so the $axis is moved to the $spasar's current position
on the torso and the $spasar is put onto one of the legs. Again the $spasar is rotated about
the $axis until its projection onto the image plane matches the corresponding leg imagel. At
this point the animal's orientation is determined. The next task is to measure the remaining
adjunct relations. A very important constraint for doing this comes from the fact that an
animal's axes tend to be parallel to one plane. This means that the girdle angles of the
adjuncts are the same modulo 180 degrees. So the inclination of the animal's neck can be
found by putting the $spasar on the torso-neck adjunct and varying the inclination angle until
the $spasar's projection lines up with the neck imagel. Finally the relative lengths of the axes
are measured by setting the $axis and $spasar onto two axes at a time and adjusting their
lengths until they just cover the corresponding imagels. Figure 15 sketches the relaxation
sequence described above.

Minimizing the complexity of the image-space processor
In our account of this theory, the image-space processor has been pared down
to almost its minimal implementation; only six directions are represented, only two are active
and (as a corollary) only four rotations allowed. The original motivation for this was to see
how simple a processor could support the mechanisms that were required. Provided that
coordinates are chosen as described in figure 9, the computations needed to support the
image-space processor are straightforward, and it is unlikely that finding an economical neural
representation for this part of the theory will prove very difficult.

The main characteristic of a minimal implementation is that it should contain
only one "rotatable" element, and rather few passively stored directions. In such an
implementation, a construct in the space-frame would not survive rotation of that frame, and
would have to be reconstructed. This is quite a strong signature of a minimal implementation.
The ability to hold directions in passive storage would sometimes be useful, though some care
is necessary when deciding whether they are still reliable.



FIGURE 15. Once a 3D-model is selected for the object, it must be related to the disposition of
the image so that the object's axis lengths and adjunct relations can be measured. This
process is carried out on the example shown in figure 14. (A) shows the 3D-model in its
standard orientation. To get the orientation of the torso, an assumption is made that it is
parallel to the ground. The $axis is set to $vertical and the $spasar in (B) is set perpendicular
to it in the image plane. The $spasar is then rotated about the $axis until its projection aligns
with the torso imagel in (C), establishing the direction of the torso axis. Next the 3D-model's
rotation about this axis must be discovered, so the $axis is moved to the $spasar's position on
the torso and the $spasar is placed on the torso-leg adjunct and rotated about the $axis until
its projection aligns with the leg imagel (D) and (E). At this point the 3D-model is correctly
oriented. In (F) and (G) the inclination angle of the torso-neck adjunct relation is measured by
placing the $spasar on the neck axis and adjusting the inclination until its projection aligns
with the neck imagel. In (H) and (I) the lengths of the torso and neck are compared by
lengthening the $spasar until its projection is the same as the neck imagel and shortening the
$axis to match the torso imagel.
















Otherwise, the structure of the theory turns less on the constraint of minimal
complexity than one might at first sight expect. For example, one important characteristic of
our representations is its ability to move fluently from a coarse one-axis description of a
whole animal to a very fine description of one small part. By removing the minimality
constraint, one could easily design a machine capable of maintaining a fine description of all
parts of the animal simultaneously, but what would be the point? The fact is that for many
purposes, a complete 3-D reconstruction of a physical object is no better than the original
object. How do you answer in which direction a horse is pointing if no part of your
representation contains something like a torso axis? Many of the diagnostics and constraints
that apply during recognition and relaxation concern the overall disposition of the whole
animal, not very local details (though they too can be important). The same is true of the
questions that one asks of an image in real life. In a very luxurious implementation, one might
maintain complete descriptions at all levels of detail simultaneously, but the resources needed
to do so could probably be better employed in other ways. Provided that a simple
implementation can compute answers to the important questions reasonably quickly, there is
no reason to use a complex implementation.


Discussion

The discussion falls naturally into two parts, one concerning the specific 3-D
representation theory, and the other dealing with the broader issues raised by the
interpreter for that theory.


1. The representation theory
There are five main points to our theory. They are:

(1) The 3-D disposition of an object is represented primarily by a stick-figure configuration,
where each stick stands for one or more axes in the object's generalized cylinder
representation.

(2) This configuration is described by a loosely hierarchical assertional database, called a 3-D
model. Use of this database is extremely free and flexible, and it can support levels of
description that cover the spectrum from very coarse to very fine detail. (cf. the principle
of graceful degradation.)

(3) In order to be useful, this database has to be interpreted through an (essentially)
analogue mechanism, called the image-space processor. In its minimal implementation, this
processor maintains a representation of two directions in a space-frame, in addition to the
gravitational vertical.

(4) The image-space processor's instruction set is small. Its most important features are:

(a) the ability to interpret an adjunct relation between the $axis and the $spasar, and

(b) the ability to execute four frame rotations (about $up, $front, $horizontal, and
$vertical).

(5) The image-space processor can deliver information about the lengths and orientations of
the appearance of the $axis and $spasar. These help the system to rotate its model into the
correct 3-D disposition relative to the viewer.


It has not escaped our notice that this theory may illuminate certain recent
findings in psychology. Shepard & Metzler (1971) created a set of images by
rotating and reflecting simple objects made of cubes (figure 16). They found that the time
taken to decide whether a pair of such images were of identical objects, rather than objects
that differed by a reflection, varied linearly with the angle through which one object must be
rotated in 3-space to become aligned with the other. The finding revived interest in "mental
imagery" and in analogue processes in perception (Cooper & Shepard (1973), Metzler &
Shepard (1974), Shepard (1975)). In addition Kosslyn (1975) has published evidence for an
analogue component to the processes that interpret mainly two-dimensional structures, like
faces and maps.

The significance of such experiments is controversial (but not the results). It is
a commonplace that an observer's description of what he does not understand is often
misleading, because the concepts through which he attempts to capture the experience are
inadequate. Because of this, "mentalist" experiments, and especially the introspective reports
that accompany them, are rightly regarded with suspicion. Although it is widely recognized
that a complete theory of mental processes will eventually have to explain the findings and
the subjective experiences that accompany them, we are probably not alone in feeling that
one should not rely on their help to construct a theory, because of the possibility that such
reports will become accurate only after the observer understands the processes that are
giving rise to them.

Another reason for the controversy seems to have been the difficulty in seeing
how an "analogue" process could benefit the computations that underlie perception and
recognition. We believe that the present theory shows a way in which such a mechanism
could be useful, although we recognize that this may not be the way in which we in fact use
it. In order to help decide this, one probably has to study the neural implementation of the
mechanisms that we described, in the hope of making testable predictions about single unit
responses. Since this is a major undertaking one needs some evidence that the theory is
indeed a likely candidate. There are several points that appear to us to constitute reasonable
grounds for believing the theory to be a good candidate for a psychological theory. They are:

(1) Pipe-cleaner animals are almost as easily recognizable as are line-drawings of animals,
despite their very abstract relation to the original. This would not be surprising if pipe-
cleaner animals were in some sense extracted from the image during the normal course of its
interpretation (as our theory asserts), but it would be surprising if not. The computational
advantages of so doing need no emphasis.

(2) The loosely hierarchical structure of our 3-D models has many computational advantages
that are almost bound to be shared by the psychological representation, even if the
psychological representations are otherwise very different. One can probably rule out any
system that cannot restrict the level of descriptive detail at any point to only that currently
needed.

(3) An important part of the theory is the minimal nature of the image-space processor. A
consequence of this is that after executing a rotation, the "image" of the 3-D model has to be
reconstructed in the new space-frame, as opposed to being constructed once and then being
rotated as a whole in the image space. The following "mental experiments" will convey the
intuitions behind this to the reader. We have confined them to the discussion, because we do
not regard such evidence as admissible in the debate about the psychological correctness of
the theory.

(a) Imagine a horse. [For most people, it is facing either left or right, which we
interpret as the initial space-frame configuration.]

(b) Imagine rotating the horse 90 degrees about its torso.

(c) Where is the neck pointing? [Most people can answer this easily.]

(d) Now imagine a new horse in the starting configuration, only this particular horse
has his legs glued onto the top of his back, pointing upwards. His head and neck are in
the usual position. Now rotate the horse 180 degrees about his torso. Where are his
legs pointing? Where is the neck pointing? [We find that people either "leave the
neck behind" in this rotation, and have to reconstruct it afterwards, or they find
themselves interrupting the reconstruction of the legs after the rotation because this
reconstruction differs from the normal one which they had (by habit) initiated. We
think of the actual rotation as having taken place with the $axis bound to the torso,
and the $spasar either to an "undercarriage" $atom, or to a $atom for the forelegs or
the hindlegs, or to a $atom for the neck, or head + neck ("bust").]

(4) The number of possible rotations in our model is small, only four being allowed. [This
seems to be true of mental rotations. For example, imagine a normal horse once again, whose
tail is at about 20 degrees to the vertical. It seems to be straightforward to execute mental
rotations about the three principal axes (the horse's up, front, and horizontal), but not about
arbitrary directions. For example, try rotating it about its tail. One either fails, or has to
resort to a special stratagem. If however one imagines the horse standing on a 20 degree
slope so that the tail actually falls down the gravitational vertical, it becomes easy to imagine
rotating it about the vertical - i.e. about the same axis relative to the horse that was
previously so difficult.]

(5) The 3-D model is loosely hierarchical. [In the previous example, one might have thought
that one could move the $axis to the tail, the $spasar to the torso, and then rotate the horse.
This possibility would be excluded if the tail $atom contained no adjunct relation for the torso.
Adjunct relations are not symmetric, and this is what in our theory produces the directional
property of the hierarchy.]

(6) There is a general agreement between our expectations and the results surveyed by
Shepard (1975). Only one of the findings (item 14 page 100) is unexpected. It comes from
Cooper & Shepard (1973, condition O), who showed that advance information giving the
orientation but not the identity of the object to be presented is not sufficient to enable
subjects to prepare for it. One might have expected that subjects could rotate their $spasar
to the appropriate orientation, and leave it there to be bound to a 3-D model when the image
was presented. In order to incorporate this finding, we would need to assume (for example)
that $spasar machinery cannot be run unless bound to a 3-D model (even if only of an
arrow), and that whenever the $spasar is rebound to a new 3-D model, the image-space
processor is reset. There are some other grounds for wanting this. The space-frame in the
image-space processor needs more than one direction to define it, and trying to construct a
space-frame around a given vector can lead to problems if the 3-D model is not simple.
Secondly, in the real world, one rarely sees two objects at the same point in the field of view.
Therefore, to change to a new 3-D model almost always requires a change in the direction of
gaze. In order to compensate for this in a minimal implementation, the $axis and $spasar
would have to be set to *** E in the starting frame, in order to carry out the primary rotations
that allow for the angle of gaze. These arguments are however weaker than the arguments
that support the rest of the theory.

The reader can amuse himself by constructing mental rotation problems, and by
devising strategies to answer questions on which he fails the first time. By noticing and
exploiting extra relations between the parts of a structure, one can quickly become much
more versatile at answering questions about the appearance of an object when rotated in a
new way. It appears to us that the mechanisms contained in our theory can account for most
of the experiences that one has when imagining such things, and this is partly why we find
the theory interesting. But we are perfectly aware that this kind of evidence cannot establish
that the theory provides a correct or even an adequate model of this component of our
perceptual faculties. What we do claim for the theory is that the computational facilities it
describes are useful for recognizing and representing the disposition of an object in
three-dimensional space.


2. The broader issues

It is no accident that the term "frame" (in the sense of Minsky 1975) has not
appeared in this article. We have been careful to use only technical terms ($atom, template)
that have a precise meaning in our theory and are supported by a working program.
Nevertheless some of our ideas have been influenced (positively or negatively) by Minsky's
extraordinarily stimulating (and frustrating) article, and we must attempt to relate our work to
the ideas of frame theory.


At the very top level, Minsky must surely be correct when he observed that
the "chunks" of reasoning, language, memory and perception ought to be larger and more
structured than most theories in artificial intelligence and psychology allowed. Failure to
realize this led to absurd attempts to "prove" from predicate-calculus-like "axioms" the
"correctness" of strategies for passing between two rooms or circumnavigating an obstacle.
Reaction to that line of thought led to the procedural embedding of knowledge, to new
languages like PLANNER (Hewitt 1969) and thence to CONNIVER (Sussman & McDermott 1972)
- a valuable experiment that has run sufficiently long now for the results to be in, and they
are conclusively negative (though with much positive spin-off).

Except at this very general level, it is not clear that the ideas of frame theory
are relevant to the specific problems of visual perception. Minsky himself has made no claims
that frames are relevant to early visual information processing. The grounds for extending
this conclusion to later processing are as follows:

(1) Gestalt phenomena. It is probably incorrect to think of frame theory as being important
for perceptual Gestalt phenomena. The Kanizsa triangle, and "sun" illusions like figure 9a
of Marr (1975a), are not caused by the descending influences of a high-level "frame-like"
organization of the input. They are due to general-purpose intermediate-level grouping
processes that act on the primal sketch and together perform much of figure-ground
separation (see also Warrington & Taylor 1973 p. 15*). Recent work by S. Ullman (personal
communication) demonstrates that the same is true for motion vision. Figure-ground
separation by relative motion is not caused by extensive top-down matching from a model to
the image. It is due almost entirely to local matching processes that operate on the changing
primal sketch, and it takes place before any description of the separated figure is computed.
Similar remarks hold for percepts due only to binocular disparity information (Julesz 1971),
and for the recognition of symmetry in a figure (Marr 1976).

(2) Multiple view representation. It is difficult to argue cogently against this representation,
because it is at present underdefined - for example, are all "views" of a man the same in
which the same limbs are visible but arranged in different positions? Nevertheless, something
of a case against it can be made from Warrington & Taylor's (1973) findings. The side view of
a bucket is very different from the top view, and both are reasonably simple. One would
expect the multiple view representation to contain them both, and (presumably) to have
indexed both of them. If Warrington & Taylor's lesions had randomly damaged the multiple
view representation, one would expect some patients to have lost one view, and others,
another. But the finding is that all patients are impaired on the same view, and that for this
and other objects (e.g. a clarinet), the lost views are precisely those furthest removed from
the objects' generalized cylinder representations. Although the multiple view representation
is not absolutely incompatible with these findings, strong extra assumptions are needed to
incorporate them. On the other hand, they are a natural consequence in our theory of losing
the image-space processor.

(3) Frame technology. The major difficulty in criticizing other aspects of frame theory is that
they have not yet been made specific enough to be refutable. For example, existing
expositions of the theory fail to define what "frames", "terminals", "slots" and "semantic nets"
could actually mean beyond the old ideas of property-lists, values and defaults (like
Raphael's), and some sort of labelled graph structure - all of which are useful ideas but which
are too simple to carry the load attached to their roles in frame theory. We encountered a
problem that might be related to the notion of "terminals" on a frame. If the "parts" of one of
our horse 3-D models are somehow construed as Minskian terminals on a horse "frame", then
one can perhaps make some correspondence between entries in our packet indexes and the
terminals of frame theory. The analogy is flawed, because our packets are based on
reference evaluation not on matching (a distinction we regard as crucial), and in any case we
found that fixed terminals proved too inflexible to be useful. We needed a system in which
there were many ways of describing the parts of an animal, and the particular ones chosen
(e.g. forelegs, or left-foreleg and right-foreleg, or sometimes both) depended on the
circumstances. The real issue, as Fahlman grasped early on, centers on creating this
flexibility, which in turn places the reference-window problem and its associated indexing
strategies firmly at the center of the debate. Like Woods (1975), we found the idea of a
semantic net too vague to be useful; differential diagnosis based on a difference-directed


violates the principle of least commitment, and does not appear to be relevant in the
type of recognition that we have been studying.
and Drew McDermott for his perspicuous and penetrating criticism. We emphasise that we
could not have developed the theory without the experience of implementing it. This in turn
would not have been possible without the extensive and flexible computing facilities that are
available at this laboratory. Karen Prendergast prepared the drawings. Work reported herein
was conducted at the Artificial Intelligence Laboratory, a Massachusetts Institute of
Technology research program supported in part by the Advanced Research Projects Agency
of the Department of Defense and monitored by the Office of Naval Research under Contract
number N00014-75-C-0643.